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ABSTRACT 

The  subject  of  this  paper  is  the  adjustment  ("curve  fitting")  of 
data  by  the  least  squares  method.  Based  on  general  formulas,  which  are 
derived  in  this  paper,  a  new  algorithm  for  the  least  squares  method  is 
established.  It  can  be  applied  to  cases  where  more  than  one  observable 
contain  observational  errors  and  where  the  postulated  relation  between 
the  observables  is  non-linear.  Also  taken  into  account  are  accuracies 
of  the  observations  and  correlations  between  the  components  of  each 
observation  vector.  New  formulas  are  derived  for  the  estimation  of 
the  variance- covariance  matrix  of  the  fitted  parameters.  It  is  shown 
that  the  conventionally  used  estimation  formula  is  theoretically  wrong 
except  for  very  limited  special  cases.  Numerical  tests  of  the  algorithm 
demonstrate  its  accuracy  and  exceptional  convergence  characteristics. 

They  also  show  that  the  conventional  estimate  of  the  variance- covariance 
matrix  of  the  parameters  is  a  very  bad  approximation  to  the  theoretically 
correct  estimate  derived  in  this  paper. 
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1.  INTRODUCTION 


Mathematical  models  of  causal  relations  between  measurable  variables 
are  often  expressed  in  form  of  functional  equations  between  the  variables. 
Typically  the  functionals  contain  a  number  of  parameters,  the  values  of 
which  are  determined  such  that  the  corresponding  relation  between  the 
variables  agrees  closely  with  experiments.  (In  case  of  only  two  variables 
this  is  often  called  "curve  fitting".)  A  most  common  technique  for  the 
determination  of  the  parameters  is  the  method  of  least  squares.  In  this 
paper  we  shall  discuss  an  algorithm  for  a  fairly  general  type  of  the 
least  squares  method,  applicable  to  partially  correlated  observations *, 
non-linear  functionals  and  finite  residuals.  Particularly  for  cases 
with  finite  residuals  the  theory  of  the  least  squares  method  has  not  been 
sufficiently  developed.  On  the  other  hand,  even  linear  curve  fitting 
with  errors  in  both  variables  cannot  be  correctly  treated  under  the  usual 
assumption  of  negligible  residuals. 

Numerical  techniques  for  problems  similar  to  those  considered  here 

have  been  suggested  by  several  authors.  Non-linear  problems  with  cor- 

1**  2 

related  data  have  been  treated  by  Brown  (1955)  ,  Tienstra  (1956)  ,  and 
Grossmann  (1961)  .  These  authors  linearize  the  constraint  functionals  by 
using  a  truncated  Taylor  expansion  and  treat  the  problem  as  one  with 
linear  constraints.  As  will  be  shown  later,  such  an  approach  generally 
will  not  give  a  correct  least  squares  solution,  unless  an  iteration  is 
included  in  the  algorithm.  Only  Tienstra  mentions  the  possibility  of  an 
iteration,  but  he  does  not  derive  the  necessary  formulas. 

Not  correlated  data  with  non-linear  constraints  are  treated  in  seve¬ 
ral  textbooks,  e.g.,  Demming  (1944) Arley  §  Buch  (1950)^  and  Wolberg 
(1967)6.  None  of  these  authors  consider  the  necessary  iteration  processes. 
Usually  one  iteration  step  or  iteration  of  the  parameters  only  is  suggested. 


*The  partial  correlation  considered  is  described  in  Section  2. 

**References  are  listed  on  page  53  . 
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For  not  correlated  data  and  special  forms  of  the  constraint  functionals 

different  algorithms  have  been  suggested  in  a  number  of  publications. 

7  8 

Some  of  the  recent  papers  are  by  Wentworth  (1965)  ,  Barieau  §  Dalton  (1966)  , 

9  in  ii  12 

York  (1966)  ,  Williamson  (1968)  ,  O'Neil  et  al.  (1969)  1  ,  Southwell  (1969)  , 

13 

and  Powell  8  Macdonald  (1972)  .  Of  these,  Powell  8  Macdonald's  algorithm 

has  convergence  properties  superior  to  others.  We  shall,  therefore,  compare 
our  numerical  results  with  those  reported  by  Powell  8  Macdonald.  The  most 
advanced  theoretical  basis  for  a  computing  algorithm  for  general  non-linear 

g 

constraints  is  given  in  the  paper  by  Barieau  8  Dalton  .  However,  their 
theory  is  restricted  to  the  case  where  only  one  observable  contains  obser¬ 
vational  errors.  Moreover,  the  description  of  the  iteration  process  is  in 
Reference  8  not  sufficiently  clear. 

The  estimates  of  parameter  variances  are  seldom  computed  correctly. 

Of  the  authors  mentioned  above,  Barieau  8  Dalton,  York,  Williamson,  and 
Southwell  derive  correct  error  expressions.  However,  these  are  applicable 
to  the  corresponding  special  cases  only. 

In  Section  2  of  this  paper  we  shall  demonstrate  the  theoretical 
deficiencies  of  some  widely  used  algorithms  and  derive  necessary  formulas 
for  a  simple  iterative  algorithm.  The  algorithm,  which  is  described  in 
detail  in  Appendix  A  is  applicable  to  problems  with  general  constraints, 
and  it  has  convergence  characteristics  which  are  equal  or  superior  to 
those  of  other  algorithms  published.  The  estimation  of  parameter  variances 
is  treated  in  Section  3.  Section  4  contains  examples  for  the  application 
of  the  algorithm.  Some  results  are  compared  in  Section  4  with  those 
of  other  authors. 

2.  COMPUTATION  OF  RESIDUALS  AND  PARAMETERS 

Let  the  mathematical  model  be  given  by  a  functionalrelationship, 

FU,9)  -  0,  (1) 

where  £  is  the  n-dimensional  position  vector  in  the  space  of  the  obser¬ 
vable  variables  and  0  is  the  p- dimensional  unknown  parameter  vector.  Let  r 
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points  Xj  (j  =  l,2,...,r)  be  observed  in  the  sample  space  and  r>p.  Let 
the  observational  errors  of  each  point  be  characterized  by  a  corres¬ 
ponding  variance-covariance  matrix  R^  with  the  dimension  (nxn) .  That  is, 
we  assume  no  correlation  between  the  different  observations  X . ,  and 
known  variances  and  covariances  of  the  components  of  each  X . .  Our  aim 
is  to  compute  a  "best  estimate"  t  of  the  parameter  vector  0,  compatible 

with  the  given  data  X.  and  R..  The  "best  estimate"  we  define  as  that 

^  ^  6 
value  of  0  which  is  obtained  by  a  least  squares  analysis  of  the  data. 


The  least  squares  analysis  is  based  upon  an  adjustment  of  the  ob¬ 
served  data  Xj»  Let  x^  be  the  adjusted  values  and 

x.  =  X.  +  c.  (2) 

J  J  J 

The  corrections  (adjustments,  negative  residuals)  c^  and  the  best  esti¬ 
mate  t  of  the  parameter  vector  are  thereby  defined  as  solutions  of  the 

12  3 

following  minimization  problem  with  r  constraints:  ’  * 

x 

|w  =  53  c{  R^c.  =  min.,  (3) 


j  =  1 


F  (X .  +  c  t)  =0, 
J  J 


( j  —  1,2 , • • . ,r) • 


(4) 


A  commonly  used  technique  for  the  solution  of  such  a  minimization 

problem  makes  use  of  Lagrange  multipliers  k.,  called  also  "correlates" 

1-5 

in  the  context  of  data  adjustment  .  According  to  that  technique  one 
minimizes  instead  of  W  the  functional 


W 


r 


k,  F (X .  +  c,,  t),  (5) 


subject  to  the  constraints  (4).  The  correlates  kj  are  determined 
simultaneously  with  the  other  unknowns,  c^  and  t. 

For  further  discussion  we  need  some  assumptions  about  the  functional 
F.  Since  we  intend  to  linearize  F  by  truncating  its  Taylor  expansion, 
we  assume  that  F  can  be  expanded  in  a  Taylor  series  within  a  finite 
neighborhood  of  each  point  (X^,  t) .  For  obvious  reasons  these  neighbor¬ 
hood  regions  must  be  sufficiently  large  to  include  the  points  (x^ ,t)  as 
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well.  In  all  practical  cases  these  assumptions  are  satisfied  because 
the  normal  purpose  of  the  adjustment  is  to  obtain  a  analytic  relation 
between  the  ^-components.  F  is  therefore  usually  chosen  such  that  it  is 
analytic  at  least  within  the  above  mentioned  regions. 


Let  A  =  grad^  F  be  the  n-dimensional  vector  of  the  partial  derivatives 
of  F  with  respect  to  the  components  of  5  and  B  =  grad.F  be  the  p-dimen- 
sional  vector  of  the  partial  derivatives  of  F  with  respect  to  the  com¬ 
ponents  of  0.  We  assume  that  neither  A  nor  B  vanish  within  the  above 
mentioned  neighborhoods  of  (X . ,  t) .  This  condition  excludes  some  singu¬ 
lar  problems  (B=0)  and  singular  points  (A=0)  from  our  considerations. 


We  obtain  necessary  conditions  for  a  minimum  of  W  by  setting  its 
partial  derivatives  with  respect  to  the  components  of  c^  and  t  equal 
to  zero.  Using  the  symbols  A  and  B  these  conditions  can  be  expressed  by 


R.3W/3C.  ■  c.  -  k,  R.  A(X .  +  c.,  t)  =  0,  (j  =  l,2,...r)  (6) 

3  j  j  j  j  j  J 


and 


~  r 

3W/3t  -  Z  k^BCXj  +  c.,  t)  =  0. 


(7) 


Equation  (6)  is  a  system  of  n  •  r  equations,  while  Equation  (7)  consists 
of  p  equations.  Together  with  the  r  Equations  (4)  we  have  thus  (n  +  l)r  +p 
equations  for  an  equal  number  of  unknowns,  namely,  n*r  components  of  the 
Cj ,  r  correlates  and  p  components  of  t.  Usually  the  equation  system 
(4),  (6)  and  (7)  is  non-linear. 


In  principle  it  can  be  solved  by  any  numerical  method  designed  for 
non-linear  systems  of  equations.  In  practice  such  an  approach  is  seldom 
considered  because  the  number  of  equations  is  often  very  large  (due  to  a 
large  r) .  Instead,  the  equations  are  linearized  and  the  solution  obtained 
by  an  iteration  process.  The  structure  of  the  linearized  equations 
makes  such  an  approach  particularly  attractive.  As  will  be  shown  later, 
the  linearized  equations  can  be  manipulated  in  such  a  manner  that  most 
of  the  unknowns  appear  in  explicit  expressions.  The  only  simultaneous 
equations  remaining  are  those  in  a  system  of  p  linear  equations. 
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The  proper  linearization  process  is  the  main  subject  of  this  section 
We  shall  derive  the  linearized  equations  and  show  that  their  solutions 
converge  indeed  to  the  solutions  of  the  original  problem.  In  contrast, 
the  usual  linearized  formulas  yield  wrong  solutions,  unless  F  is 
linear  in  £  and  B  is  independent  of  £. 

We  start  our  considerations  by  outlining  the  usual  linearization  pro 
cess  and  pointing  out  some  of  its  errors. 


Let  T  be  an  approximation  to  t  and, 

x  =  t  -  T 

Expanding  F  in  a  Taylor  series  at  the  point  (X . ,  T)  we  obtain. 


(8) 


T  T 

F  (X .  +  c.,  T  +  t)  =  F.  +  A.  c.  +  B.  x  +...,  (9) 

v  j  j*  J  jo  jo  j  jo  *  K  J 


where 


tj0  -  F(Xr  T),  A.o  =  ACXj.T). 

Bjo=  B(X.,  T). 


and 


The  supremum  of  the  error  introduced  by  truncating  the  series  after 
the  linear  terms  is  equal  to  one  half  of  a  maximum  value  of  the  second 
order  terms.  Let  the  matrices  of  the  second  order  derivatives  of  F  be 
denoted  as  follows. 


At  =  k  =< 


I  92F  ) 
3£186k) 


(10) 


i  1c  i  k 

where  £  ,  £  ,  0  ,  and  6  ,  are  the  components  of  £  and  0,  respectively. 
The  error  of  the  linearization  is  negligible,  if 


I  .  T  T  |  |  T  < 

A.  c.  +  B.  x  »lc.A  c  . 

1  JO  J  JO  1  *  J  X  J 


+  2c.A.x  + 
J  t 


^Vlmax* 


(ID 


whereby  the  second  order  derivatives  on  the  right  hand  side  of  Equation 
(11)  are  evaluated  at  such  a  point  between  (X^,T)  and  (X^  +  c^ ,  T  +  x) , 
which  yields  the  maximum  value  for  the  expression. 
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The  condition  (11)  is  obviously  satisfied  if  | |  and  |t|  are 
sufficiently  small.  Let  us  assume,  that  the  parameters  T  and  the  adjustments 
Cj  are  computed  by  an  iterative  process.  If  that  process  converges,  then 
T->-t,  that  is  x-K),  while  the  c^  approach  their  respective  least  squares 
values.  In  the  limit  the  linearization  error  is  then  bounded  by: 


e, .  <1/2  c .  A  c .  . 

1  lin 1  j  x  j 1  max 


(12) 


The  right  hand  side  of  Equation  (12)  is  zero  only  if  F  is  a  linear 
function  of  £.  Hence  only  in  that  case  we  can  be  confident  that  the 
linearization  of  the  constraints  F  =  0  does  not  affect  the  Equation  (4). 


This  result  is  insofar  important  as  very  often  the  constraint 
functionals  F  are  linear  in  the  parameters  0  but  not  in  the  observables 


We  shall  now  consider  cases  where  F  is  not  a  linear  function  of  £ 
and,  moreover,  make  no  assumptions  about  the  size  of  the  corrections 
Cj .  Particularly,  we  will  not  assume  that, 

I  I  A.  I  I » I  I A  c . I  I  (13) 

II  jo1 1  11  x  j 1 *max  v  J 

is  satisfied,  which  condition  would  justify  the  linearization  (9)  in 
case  Ay*0.  The  traditional  assumption  at  this  point  is,  that  the  ||c  .  || 

X  J 

are  sufficiently  small  so  that  (13)  is  satisfied.  We  note  in  passing 
that  (13)  might  not  be  satisfied  even  if  the  1 1 c ^ 1 1  are  "small"  because 
that  condition  depends  on  the  formulation  of  the  constraint  functional 
F.  Thus,  the  same  Cj 's  can  satisfy  (13)  for  one  functional  F  and 
fail  to  satisfy  (13)  if  F  is  formulated  differently. 

An  obvious  way  to  reduce  linearization  errors  is  to  expand  F 

at  a  point  which  is  closer  to  the  solution  than  (X.,T).  This  approach 

2 

has  been  suggested,  for  instance,  by  Tienstra  without  deriving  the 

13 

corresponding  formulas.  Powell  §  Macdonald  use  the  same  idea  for 
an  algorithm  which  is  different  from  ours. 


18 


(14) 


Let  Cj  be  an  approximation  to  and. 


x.  =  X.  +  C.  +  &  .. 

3  3  3  3 

Linearizing  the  equations  (4)  at  (X^  +  C . ,  T) ,  we  make  the  conventional 

assumption,  that  the  second  order  therms  are  small  compared  to  the  linear 

terms.  Let  F.  =  F(X.  +  C.,  T) ,  A.  =  A(X.  +  C.,  T)  and  B.  =  B(X.  +  C.,  T) . 

3  3  3  3  3  3  3  3  3 

The  linearization  conditions  we  may  express  then  by  the  following  inequa¬ 


lities,  which  we  assume  to  be  satisfied  for  j  =  l,2,*..,r: 

A .  |  |  » I  I A  e  .  +  A  x  |  I 

j  i  i  x  j  t  Mi 

T  „T 


1  max 


and 


>> lie.  8'  + 

1  1  j  x 


*\\ 


1  max 


(15) 

(16) 


whereby  the  maximum  is  to  be  taken  between  the  points  (X^  +  C . ,  T)  and 
(Xj  +  Cj  +  Ej,  T  +  x) .  Note  that,  unless  A^  and  B^  vanish,  the  inequalities 
(15)  and  (16)  can  be  satisfied  for  sufficiently  small  ,  and  t.  (Ac¬ 
cording  to  our  assumptions  about  F,  the  A^  and  B^  do  not  vanish  in  a 
neighborhood  of  =  0  and  x  =  0.) 

With  (15)  and  (16)  satisfied  we  reduce  Equations  (4),  (6),  and 
(7)  to  the  following  system  of  linear  equations  for  e . ,  k^,  and  x: 


and 


F.  + 

at 

T 

e  .  +  B .  x  =  0 , 

(j  = 

3 

3 

3  3 

C.  + 

3 

e . 

3 

-  k.  R.  A.  =  0, 

3  3  3 

(3  = 

T 

i 

k. 

B.  =  0 

=  i 

3 

3 

(j  =  l»2,,..,r), 


(17) 

(18) 

(19) 


We  shall  now  simplify  this  system  of  equations.  Let, 

1 


g-:  7 

3  a!  r.  a. 

3  3  3 


be  the  ''weight"  of  the  observation  j 


<1.2 


(20) 


Multiplying  Equation  (18) 


from  the  left  by  A^  and  subtracting  the  result  from  Equation  (17)  we 
obtain  an  explicit  formula  for  the  correlates  k^ : 
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kj  '  VAicj 


F.  -  B.  x), 
3  3 


(j  -  1,  2 , . . .  ,r) 


(21) 


Substitution  of  Equation  (21)  into  Equation  (19)  yields. 


r 

E 

j  =  i 


E  *(a!c4  -  F.)B4. 


J  3  3 


j  =  1 


J  J  J 


r  j 


(22) 


Substitution  of  Equation  (21)  into  Equation  (18)  furnishes  an  explicit 
formula  for  the  corrections  of  the  observations : 


C.  +  c  .  =  g  .(A.  C.  -  F.  -  B.x)R.A. 
3  3  J  J  J  3  3  3  3 


(j  —  1,  2 , . . . ,r) . 


(23) 


Equation  (22)  corresponds  to  the  "normal  equations"  in  linear  least 


squares  problems. 


It  furnishes  the  correction  x  of  T,  if  C.  and  T  are 

3 


given.  After  the  computation  of  x,  the  corresponding  corrections  +  e_. 
of  the  observations  can  be  computed  by  Equation  (23)  and,  after  replacing 
T  by  T  +  x  anc  by  +  z.,  the  process  repeated.  The  correlates  k^, 
if  needed,  can  be  computed  by  Equation  (21)  at  each  cycle.  Particulars 
about  an  iteration  algorithm  based  on  Equations  (21)  through  (23)  are  given 
in  Appendix  A  and  numerical  examples  shown  in  Section  4.  Here  we  note 
only  that  we  have  found  advantageous  to  iterate  Equation  (22)  several 
times  (by  replacing  T  by  T  +  x  in  all  arguments)  before  using  Equation 
(23).  Also,  Equation  (23)  should  be  iterated  (by  replacing  by  +  t^) 
before  returning  to  Equation  (22). 


If  the  iteration  converges  then  by  definition  c^  and  T  t. 

The  set  of  Equations  (21),  (22),  and  (23)  is  equivalent  to  the  set  of 
Equations  (17),  (18),  and  (19).  Hence,  at  the  limit  c^.  and  t  satisfy 
the  equations 

V°- 

c  .  =  k .  R.  A. , 

3  3  3  3 


(24) 


3  =  1 


It.  B.  =  0. 
3  3 
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Therefore,  the  K^,  x^  and  t  are  indeed  solutions  of  the  original  Equations 
(4),  (6)  and  (7).  The  correlates  converge  thereby  to  the  values 


k  .  =  g.  A.  c  . . 
1  1  J  J 


(25) 


When  the  iteration  starts,  then  the  =  0  and  the  right  hand 
side  of  Equation  (22)  is  equal  to  -  ^  g^F^Bj,  that  is,  equal  to  the 

commonly  used  right  hand  side  of  normal  equations.  However,  as  the 

iteration  proceeds,  the  F.'s  approach  zero  and  the  right  hand  side  of 

J  V*  T 

the  normal  equations  approaches  Ljg^AX.^By  which  is  the  expression  in 
Equation  (7) .  This  is  the  main  difference  between  the  commonly  used 

g 

algorithms  and  ours.  Of  the  cited  references  only  Bariean  &  Dalton  use 
a  right  hand  side  corresponding  to  Equation  (7). 


The  conventional  normal  equations  for  correlated  observations  are 


1,2,3 


2g.B.  b!t  =  2  k.B. 


J 


’JO  JO  JO 


J 


JO  JO 


with 


k  .  =  -  g  .  F  . 

J  jo  jo 


(26) 

(27) 

The  adjustments 

(28) 


The  Equations  (26)  and  (27)  are  iterated  until  x  =  0. 
are  then  computed  with 

c.  =  k  .  R.  A (X . ,  t). 

J  J  J  J 

The  difference  between  the  results  obtained  by  this  process  and  by 
our  algorithm  can  be  readily  seen  if  we  compare  the  corresponding 
formulas  for  the  correlates,  namely  Equations  (21)  and  (27).  These 
formulas  are  identical  at  convergence  only  if  F  is  a  linear  function  of 
£.  The  gradients  A^  are  then  independent  of  £  and  the  c^,  too,  are 
computed  by  identical  formulas  (cf.  Equation  (24)  and  (28)).  However, 
instead  of  2  k^B(Xj+Cj,t)  =  0,  the  iteration  of  Equations  (26)  and 

(27)  furnishes  2  k.B(X.,t)  =  0.  Hence,  unless  F  is  linear  in  E,  and 

j  3  3 

B  is  independent  of  £,  the  results  will  not  satisfy  Equations  (7). 
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We  note  in  passing  that  these  conditions  are  met  in  the  common 
case  of  "curve  fitting  with  errors  in  one  coordinate",  whereby  the 
constraints  must  be  solved  explicitly  for  that  coordinate.  The  constraints 
have  then  the  form, 

F(£,0)  =  5  -  f(0)  =  0  (29) 


The  functionals  f  contain  generally  different  constants  (the  error  free 
coordinates)  for  different  observations.  Such  a  functional  F  is  indeed 
linear  in  £  and  the  =  grad^  f . (0)  are  independent  of  £. 


We  have  thus  shown  that  while  usual  normal  equations  do  not  yield 
the  least  squares  solution  in  the  general  case,  only  trival  changes 
of  these  equations  are  necessary  to  insure  correct  solutions  for  finite 
corrections.  Moreover  the  employment  of  the  equation  system  (20)  through 
(23)  does  not  require  any  special  form  of  the  constraint  functional  F.  It 
is  also  interesting  to  note  that  our  equations  do  not  contain  higher  than 
first  order  derivatives  of  F. 


13 

The  use  of  higher  order  derivatives  is  advocated  by  Powell  §  Macdonald  . 

It  is  claimed  that  such  derivatives  (but  only,  if  they  are  computed  numerically!) 
will  increase  the  rate  of  convergence.  According  to  the  derivation  of  our 
equations,  higher  order  derivatives  can  be  retained  either  in  Equation  (18), 
or  in  Equation  (19),  or  both,  without  making  the  problem  non-linear  in 
e  and  t.  Inclusion  of  the  term  A^c  +  t  in  Equation  (18)  however, 
contradicts  the  assumption  (15),  which  was  made  for  the  linearization 
(17)  of  F.  Similarly  the  terms  B^c  +  B^t  in  Equation  (19)  can  be  neglected 
because  of  the  assumption  (16).  Hence  the  inclusion  of  these  terms, 
without  using  a  second  degree  approximation  for  F,  would  introduce  an 
inconsistency  in  the  linearization  process.  Since  it  also  leads  to  much 
more  complicated  equations  we  have  not  attempted  to  code  and  test  the 
performance  of  such  an  approach.  As  far  as  the  rate  of  convergence  is 
concerned,  our  algorithm  was  found  to  perform  equally  good  or  better 
than  that  of  Powell  §  Macdonald  as  shown  by  the  examples  in  Section  4. 
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3.  ESTIMATED  VARIANCES  AND  COVARIANCES  OF  THE  PARAMETERS 


An  essential  assumption  which  is  usually  made  in  error  analysis  is, 
that  all  variances  are  sufficiently  small  to  permit  a  linearization  of 
relations  involving  the  variances.  We  shall  carry  out  an  error  analysis 
for  the  least  squares  parameters  under  this  assumption.  In  our  case 
such  an  assumption  contradicts  in  a  sense  the  assumption  made  in  the 
previous  Section,  namely  that  the  estimated  corrections  are  finite. 

We  note,  however,  that  a  similar  contradiction  is  inherent  in  any  least 
squares  error  analysis.  Assuming,  for  instance,  normal  distribution  of 
observation  errors  we  admit  the  possibility  of  very  large  errors  and 
yet  estimate  their  effects  using  a  linear  model.  In  our  case  we  did  not 
impose  any  conditions  on  the  sizes  of  the  corrections  c^  in  order  to 
obtain  correct  formulas  for  the  least  squares  solutions.  Once  the  cor¬ 
rect  solutions  are  obtained  an  error  analysis  may  be  very  well  based  on  a 
"small  deviation"  theory.  As  usual,  the  results  will  not  be  applicable 
to  large  uncertainities  but  they  will  correctly  express  the  sensitivity 
of  the  results  to  observational  errors. 


According  to  Equations  (24)  and  (25)  we  have  the  following  relations 
between  the  least  squares  values  c^.  and  t: 

2  g  .B  aTc  .  =  0,  (30) 

j  t  p  J  J  J 


Fj  =0,  j  =  1, . . . ,r  (31) 

Cj  =  g-j^jAjAj  c j ,  j  =  l,»..r.  (32) 

We  are  interested  in  variations  dX^  of  the  observations  ,  corresponding 
variations  dt  of  the  parameter  t,  and  variations  dc^  of  the  residuals 
c . ,  such  that  the  Equations  (30)  through  (32)  hold.  We  obtain  the 
desired  relations  between  dXj ,  dt  and  dc^.  by  differentiating  these 
equations  and  setting  the  differentials  equal  to  zero.  To  simplify  the 
notation  we  shall  omit  the  index  j  from  our  formulas. 
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Differentiation  of  Equations  (31)  yields  thus, 

ATdX  +  ATdc  +  BTdt  =  0. 


(33) 


Differentiation  of  Equation  (30)  yields  after  some  manipulation  and 
use  of  Equation  (25) 


2}(-gBcTA  +  kB  )  (dX  +  dc)  +  gBATdcf  + 

t  A 

2  C-gBcTAt  +  kB  )  dt  =  0 


(34) 


In  order  to  estimate  the  variance  of  t  we  have  to  express  dt  in  terms 

of  the  observation  variations  dX,  that  is,  we  have  to  eliminate  dc  from 

T 

the  equations.  Hie  product  A  dc  can  be  eliminated  from  Equations  (34) 
with  the  aid  of  Equation  (33).  The  remaining  terms  containing  dc  can 
be  eliminated  by  making  use  of  the  differential  of  Equations  (32),  which 
is 


dc  =  -kR(gAA  R  -  1)  A  (dX  +  dc)  + 
+  gRAATdc  _  kR(gAA^R  -  I)A  dt. 


(35) 


T 

After  elimination  of  A  dc  from  Equation  (35)  with  the  aid  of  Equation  (33) 
and  rearanging  of  terms  we  obtain 


{I  +  kR(gAATR  -  I)Ax|  dc  = 

=  R|-gAAT  -  k (gAATR  -IjA^dX  +  (36) 

+  R{-gABT  -  k  (gAATR  -  I)  AJdt. 

Equation  (36)  can  be  solved  for  the  dc  if, 

det  |  1+  kR(gAA^R  -  I)Ax|  f  0  (37) 


The  condition  (37),  replaces  for  our  analysis  the  usual  assumptions, 

I  +  kR(gAATR  -  I)A  »  I  (38) 

or 

A  =  0  (39) 

x 

It  is  essentially  a  condition  for  the  sizes  of  the  residuals  c,  because 
T 

k  =  gA  c ,  according  to  Equation  (25).  Let, 

G  -  I  +  kR  (gAATR  -  I)A  ,  (40) 
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T1  =  G~h  j-gAAT-k(gAATR-I)/^} 


and 


r2  =  6-1R  j-gABT-k(gAATR  -  I  jA^ 
Equation  (36)  yields  then  under  the  assumption  (37) , 


dc  =  Tj  dx  +  r2  dt. 


(41) 

(42) 

(43) 


Elimination  of  dc  between  Equation  (33),  (34),  and  (43)  gives  finally  the 
desired  relation  between  dt  and  dX: 

,T  .  Ti  T, 


2  {  gBB1  +  gBc1At  -  kBt  +  (gBc1Ax  -k  8x)r2  }  dt  = 
=  2  { -gBAT  -  (gBcTAx  -k  8x)  (  I  +  rp  }  dX. 


This  equation  is  of  the  form 

r 


©dt  =  y  H.  dX. . 
3  =  1  3  3 


(44) 


(45) 


with  obvious  meanings  of  ©and  H..  We  apply  the  general  error  propagation 
12  3  J 

formulas  *  *  to  Equation  (45)  in  two  steps.  First,  we  compute  the 

contribution  M.  of  the  observation  X.  to  the  variance  estimate  of  t.  M. 

J  3  1  2  3 
is  according  to  our  assumptions  about  the  observations  equal  to  ’  : 


M.  =  m2  H.R.hT 
3  OJ33 


(46) 


where  mQ  is  a  proportionality  factor  to  be  discussed  later.  The  estimated 

variance -covariance  matrix  of  the  components  of  t  is  then  equal  to 

x 

V  =  m2  ©_1  (  2  M  )  (©“V.  (47) 

3  =  1  3 

It  can  be  easily  verified,  that  in  case  we  neglect  all  second  order 

derivatives  in  Equation  (44),  V  becomes  equal  to  the  conventionally 

used  variance- covariance  matrix, 

x* 

V  =  m2  (  2  g . B . bT) - 1  (48) 

conv  o."6j;j;r 

j  =  l  J  J  J 

which  is  proportional  to  the  inverse  of  the  normal  equation  matrix  in 
Equation  (22). 
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We  note,  however,  that  even  if  the  constraint  functional  6) 

T 

is  linear  in  both  arguments,  £  and  0,  the  matrix  =  8^  does  not 

vanish.  Hence  the  conventional  variance- covariance  matrix  \J  4  \J 

conv 

even  in  linear  multi-dimensional  adjustment  problems. 

We  pointed  out  in  Section  2  that  conventional  least  squares  algorithms 
provide  correct  results  (parameters  and  residuals)  if  the  constraint 
functional  is  of  the  form. 


K  -  f (6)  =  0 


(49) 


with  one- dimensional  £.  For  the  variance- covariance  matrix  l /  we  obtain 
in  that  case  the  formulas. 


2  (gBB  -kBt), 
B, 


2 

m  g . 
o 


BT 
3  3 


(50) 

(51) 


and 


V  =  [Z(gBBT  -  k8t)]_1  (ZgBBT)  |  [  2  (gBBT  -  k8t)  ]_1  }  T  (52) 

This  matrix  is  equal  to  the  conventional  ^conv  of  Equation  (48)  only  if 
B  =  0,  that  is,  if  the  functional  f(0)  in  Equation  (49)  is  linear  in 

e. 


In  summary  then  the  conventional  least  squares  algorithms  can  be 
safely  used  only  if  the  following  conditions  are  all  satisfied. 

(a)  In  each  observation  only  one  component  is  adjusted. 

(b)  The  constraint  functional  is  of  the  form  (49). 

(c)  The  functional  f(0)  is  linear  in  the  parameter  0. 

In  all  other  cases  Equation  (47)  should  be  used  instead  of  Equation  (48). 
Examples  in  Section  4  will  show  that  in  general  the  importance  of  the 
higher  order  derivatives  cannot  be  assessed  in  advance.  Either  Equation 
(47)  or  Equation  (48)  can  furnish  larger  diagonal  elements  of  the  matrix. 
Moreover,  corresponding  off-diagonal  elements  of  V  and  ^conv»  respectively, 
can  even  have  opposite  signs.  These  facts  makes  ^conv  virtually  use¬ 
less  as  an  approximation  to  the  estimated  variance-covariance  matrix. 


26 


The  complexity  of  the  formulas  for  1/  is  of  little  practical  con¬ 
sequence,  because  these  formulas  are  evaluated  only  once  for  each  adjust¬ 
ment  problem.  The  establishing  of  the  necessary  second  derivative 
expressions  can  be  a  more  serious  problem.  That  problem,  however, 
might  be  attacked  by  using  a  symbol  manipulation  code  for  the  computation 
of  complicated  derivatives. 


2 

The  proportionality  factor  mQ  ("variance  of  weight  one")  should  be 
set  equal  to  one,  if  the  variances  R^  of  the  data  are  well  known.  Other¬ 
wise,  mostly  the  value. 


*2 

mo  = 


— W  =  -i-  2  c  T  RT 1  c  . 
r-p  r-p  j  =  1  }  3  3 


(53) 


is  used*,'*,^,°  as  an  approximate  to  m2.  Macdonald  (1969)^  has  shown 
that  a  better  estimate  of  m2  is  obtained  if  W  in  Equation  (53)  is  cor¬ 
rected  by  a  term,  which  in  his  case  (n  =  2,  R^  =1  )  is  the  square  of 
an  algebraic  sum  of  the  weighted  residual  norms  |  |c^|  |.  The  sign  of 
each  term  is  thereby  assumed  positive  or  negative,  depending  on  the 
position  of  above  or  below  the  curve  F  =  0.  In  our  case  a  corresponding 
correction  can  be  obtained  by  observing  that  W  is  the  weighted  sum  of 
the  correlate  squares,  namely. 


W  =  2cTR_1c  =  Ek  ATRR-1RA.k  = 

=  2k2ATRA  =  2— 2 
g 


(54) 


Following  Macdonald  we  compute  the  algebraic  average  of  the  terms  k  -H g-  by, 

r  k  J  •* 

k  =  b  2  (55) 

,  j=l  g3 


and  define  m  by 
o  7 


m2  Up- 

0  r'p  j=iV*j 


) 


(56) 


1  —  2 
—  (W  -  rk  ). 
r-p  v  J 


The  correlates  kj  are  related  to  the  residuals  by. 


kj  =  gjcjAj  =  gjcj  grade  V 


(57) 
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The  sign  of  depends,  therefore,  on  the  location  of  the  observation 

with  respect  to  the  hypersurface  F=0:  for  all  Xj  on  one  side  of  the 

hypersurface  the  products  c-  grad_  F.  have  equal  sign,  if  we  assume  that 

J  £  J 

the  adjustments  are  such,  that  the  straight  line  from  Xj  to  x^  does  not 
penetrate  the  hypersurface.  (If  it  does,  our  solution  is  not  the  best 
least  squares  solution.  Such  singular  cases  we  do  not  consider  here.) 
Hence  the  sign  convention  in  Equation  (56)  is  the  same  as  the  one  used 
by  Macdonald  for  curve  fitting. 


4.  EXAMPLES 


In  this  Section  we  shall  present  some  examples  of  computations  by 

the  method  described  in  previous  Sections.  First,  we  shall  compare  the 

13 

performance  of  our  method  with  that' of  Powell  §  Macdonald  ,  second,  we 
shall  demonstrate  the  importance  of  correct  variance  estimates  in  an 
example  of  recent  work  at  BRL  and,  third,  we  shall  give  an  example  of 
approximation  by  multivalued  curves.  For  all  computations  presented 
here,  the  computer  code  was  used  which  is  described  in  Appendix  A. 

In  the  first  group  of  examples  polynomials  are  fitted  in  a  two- 
dimensional  space  to  a  data  set  of  10  points.  Let  the  coordinates 
in  the  space  of  observables  be  called  x  and  y,  respectively.  The 
polynomial  constraints  are  then  of  the  form, 

P  i-1 

F(x,y;e)  =  y  -  2  0-  x  =o 

i=l 

The  data  (Table  I)  are  taken  from  Reference  13.  The  coordinates  of  the 

14 

data  points  were  originally  given  by  Pearson  (1901)  and  the  weights 

9 

are  due  to  York  (1966)  .  For  our  calculations  we  computed  the  variance- 
covariance  matrices  R  from  York's  weights  by  the  formulas. 


r 

r 

r 


11 

22 

12 


1/Wj  , 

l/w2  , 
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where  r..  are  the  elements  of  R.  We  have  also  fitted  Pearson's  data 
assuming  unit  weights.  In  those  cases  all  the  R's  were  assumed  to  be 
unit  matrices. 

Whenever  it  was  possible,  we  have  compared  our  results  with  those 
13 

of  Powell  §  Macdonald  .  Those  authors  have  shown  that  their  algorithm 
is  better  for  fitting  Pearson's  data  than  a  number  of  other  methods. 
Therefore,  a  comparison  with  Powell  §  Macdonald's  results  is  a  sensible 
test  for  our  method.  In  the  sequel  we  shall  refer  to  Powell  §  Macdonald's 
method  by  calling  it  the  PM  algorithm. 

Table  II  displays  the  results  ofa  linear  fit  to  Pearson's  data 
with  unit  weights.  The  crucial  end  condition  for  the  iteration  in  this 
and  all  other  examples  was  that  the  absolute  changes  of  the  parameters 

_7 

between  cycles  become  less  than  10  times  a  "crude  standard  error"  of 

the  corresponding  parameter.  (For  particulars  see  Appendix  A.)  The 

"crude  standard  error"  was  thereby  obtained  by  taking  the  inverse  of 

2 

the  normal  equation  matrix  times  mQ  as  an  approximation  to  the  variance- 
covariance  matrix.  These  crude  error  estimates  were  computed  at  the 
end  of  every  cycle  for  the  sole  purpose  to  check  the  end  conditions. 

The  final  standard  error  estimates  were  obtained  after  completed  iteration 
using  the  formulas  derived  in  Section  3.  The  differences  between  both 
estimates  show  that  second  order  derivatives  have  an  effect  on  error 
estimates  even  in  this  rather  trivial  linear  case. 

Another  result  which  is  typical  for  all  our  computations  is  the 
fact  that  W  becomes  stationary  within  computing  accuracy  (about  15 
decimal  digits)  before  the  end  of  the  iterations.  This  indicates,  that 
near  its  minimum  W  is  relatively  insensitive  to  small  changes  of  the 
parameters . 

The  results  of  the  computation  are  displayed  graphically  in  Figure 
la,  where  the  observed  points,  corrected  observations,  the  error  ellipse 
for  each  observation,  and  the  fitted  line  with  its  one  standard  error 
confidence  limits  are  plotted.  The  error  ellipses  are  circles  in  the 
present  case  with  unit  weights.  The  factor  mQ,  computed  with  Equation 
(56),  is  included  in  the  graphical  as  well  as  numerical  results  in  this 
and  other  examples. 
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Table  III  and  Figure  lb  show  the  results  of  a  linear  fit  to  Pearson's 

data  with  York's  weights.  This  case  can  be  compared  with  similar  calcu- 

13 

lations  by  the  PM  algorithm,  the  results  of  which  are  included  in  Table 

III.  The  comparison  indicates  a  slightly  better  rate  of  convergence 

for  our  method.  The  respective  final  parameter  values  are  equal  within 

five  places  for  both  methods.  There  are,  however,  significant  differences 

in  the  error  estimates  computed  by  our  method  and  by  Powell  §  Macdonald. 

Our  estimates,  which  are  computed  using  the  formulas  of  Section  3,  agree 

10  12 

with  those  computed  by  Williamson  (1968)  and  Southwell  (1969)  .  The 

estimates  of  the  PM  algorithm  are  obtained  in  the  same  manner  as  our 
"crude  error  estimates",  that  is,  by  inverting  the  normal  equation  matrix. 
The  Powell  8  Macdonald  error  estimates  differ  from  our  crude  estimates 
because  the  normal  equation  matrix  of  the  PM  method  is  different  from 
ours.  Elements  of  that  matrix  consist  in  the  PM  method  of  certain  second 
order  derivatives  of  W  which  are  computed  by  numerical  differentiation. 

No  theoretical  basis  for  the  computation  of  error  estimates  in  this  manner 
is  given  in  Reference  13.  Moreover,  in  the  PM  algorithm  certain  derivatives 
are  neglected  in  order  to  simplify  the  equations.  We  conclude,  there¬ 
fore,  that  the  error  estimates  obtained  by  Powell  8  Macdonald  are  smaller 
because  significant  terms  in  the  error  equations  have  been  neglected. 

Tables  IV  and  V  and  Figures  2c  and  2b  give  the  results  of  fitting 
a  cubic  to  Pearson's  data.  In  case  of  unit  weights  the  results  can 
again  be  compared  with  those  of  the  PM  algorithm.  The  respective  para¬ 
meters  computed  by  either  method  agree  within  the  five  places  given  in 
Reference  13.  The  rates  of  convergence  are  comparable.  The  error 
estimates  are  again  different.  Our  formulas  furnish  estimates  which  are 
bigger  than  the  conventionally  computed  crude  estimates,  while  the 
estimates  given  by  Powell  §  Macdonald  are  smaller  than  our  crude  ones. 

Tables  VI  and  VII,  and  Figures  3A  and  3B  give  the  restilts  of 
quintic  fits  to  Pearson's  data.  The  comparison  with  the  PM  algorithm 
in  the  case  of  unit  weights  (Table  VI)  reveals  a  weakness  of  that 
algorithm  in  addition  to  the  inaccurate  error  estimates.  This  time  the 


30 


parameter  values  computed  by  Powell  §  Macdonald  are  different  from  ours. 

-2 

The  relative  magnitudes  of  the  differences  are  as  large  as  2*10 

Differences  of  such  magnitude  are  suspicious  because  the  iteration  end 

conditions  in  both  algorithms  required  that  the  relative  changes  of 

—  6  ""7 

the  parameters  between  cycles  be  less  than  10  (less  than  10  for  the 
PM  algorithm) .  In  order  to  find  out  which  is  the  "better"  parameter 
set  we  have  recalculated  the  W  value  for  the  parameters  of  the  PM 
algorithm.  (Powell  5  Macdonald  give  that  value  with  five  places  only.) 
The  result,  shown  in  Table  VI,  indicates  that  our  parameter  set  produces 
a  smaller  W  value.  The  reason  why  the  PM  algorithm  fails  in  this  case 
to  converge  to  a  minimum  of  W  is  likely  to  be  found  in  the  computation 
of  derivatives  of  W  by  numerical  differentiation.  Since  near  its 
minimum  W  is  rather  insensitive  to  changes  of  parameter  values,  the 
numerically  computed  second  derivatives  have  low  accuracies.  One  would 
expect  that  this  behavior  of  W  has  a  detrimental  effect  on  the  results 
of  the  PM  algorithm,  particularly  when  the  number  of  parameters  is  not 
very  small.  Our  results,  on  the  other  hand,  are  not  influenced  by  the 
above  mentioned  behavior  of  W,  because  the  numerical  value  of  W  is 
not  used  in  our  calculations. 

Our  next  example  illustrates  the  importance  of  accurate  error 
estimation  procedures.  In  this  example  the  constraint  functional  has 
the  form. 


where, 

a,f  and  p  are  observables, 
a0,  fQ,  pQ  and  a  are  parameters, 

and 

n  =  m  =  0.44. 


31 


This  approximation  problem  arises  in  the  evaluation  of  acoustic  amplifi¬ 
cation  measurements.  (About  the  physical  background  of  such  measurements 
see  Ibiricu  (1966) ***.)  The  data  set  to  be  fitted  is  given  in  Table  VIII 
and  the  results  are  shown  in  Table  IX.  The  crude  error  estimates  are 
in  this  case  of  the  same  order  aa  our  accurate  estimates.  However, 
comparing  corresponding  elements  in  the  full  variance-covaraince  matrices 
we  notice  that  one  of  the  off-diagonal  elements  has  a  wrong  sign  in  the 
inverted  normal  equation  matrix.  Hence  the  use  of  that  matrix  as  an 
approximation  to  the  variance- covariance  matrix  can  lead  to  serious 
errors,  even  if  the  diagonal  elements  (i.e.,  parameter  error  estimates) 
are  of  the  right  order.  If  different  methods  for  variance  estimates 
are  compared  numerically,  then  obviously  the  full  variance- covariance 
matrices  should  be  compared  and  not  only  the  diagonal  elements.  Fitting 
the  same  functional  to  other  data  sets  we  obtained  in  some  cases  parameter 
error  estimates  which  were  up  to  three  times  larger. than  the  corresponding 
crude  estimates,  while  in  other  cases  the  crude  estimates  were  larger. 

It  is  thus  obvious  that  the  conventional  crude  error  estimates  are 
practically  useless,  because  one  never  knows  whether  they  are  too  large 
or  to  small.  Moreover,  the  signs  of  the  crude  off-diagonal  covariance 
estimates  can  be  wrong. 

Our  next  example  demonstrates  the  usefullness  of  the  general  formu¬ 
lation  of  the  constraint  functional  in  the  case  of  multi-valued  functions. 
In  order  to  make  the  presentation  of  the  results  simple  we  have  chosen 
a  two-dimensional  space  of  observables  (x,y  -  plane)  and  fitted  a 
generalized  Cassinian  curve.  This  example  shows  also  how  correlations 
between  observables  natuaally  enter  the  calculations. 

We  assumed  that  the  distance  r  and  the  angles  of  sight  v?  were 
measured  independently  and  that  the  corresponding  error  estimates  eere 
e  and  e^,  respectively.  The  cartesian  coordinates  of  an  observation 
(r,<£)  are  then  given  by. 
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x  =  r  cosy> 
y  =  r  siny> 

while  the  variance- covariance  matrix  for  x  and  y  is, 

(  22  222  222  \ 
e^  cos  y>  +  r  e^,  sin  y>  («r  -  r  e^>  )  sin  y>  cos  y>  \ 

222  22222/ 

(er  -  r  e^  )  siny>  cosy)  er  sin  y>  +  r  ey>  cosy>  / 

The  cartesian  coordinates  of  the  (assumed)  observations  and  the  error 
estimates  e  and  e^,  are  given  in  Table  X. 

The  constraint  functional  was  formulated  as  follows, 

F  =  [(x  -  Xj)2  +  (y  -  yx)2]*  [(x  -x2)2  +  (y  -  y2)2  *  b]  -  a  =  0 

This  functional  depends  on  six  parameters,  namely  x^,  y^,  x2,  y2,  b  and 

a.  For  b  =  1  and  a  >  0  we  have  a  regular  Cassinian  curve.  A  reformu¬ 
lation  of  F  =  0  in  polar  coordinates  is  not  necessary  because  our 

method  takes  the  correlations  between  x  and  y  into  account. 

The  curves  F  =  0  are  multiple  valued  for  both  coordinates,  x  and 
y.  Since  in  our  method  the  constraint  functional  need  not  be  solved  for 
one  observable,  no  further  manipulation  of  the  constraint  functional 
is  necessary,  except  for  the  computation  of  derivatives  of  F. 

The  results  of  the  adjustment  are  shown  in  Table  XI  and  in  Figures 
4a  and  4b.  Comparing  the  estimated  variance- covariance  matrix  with 
the  inverted  normal  equation  matrix  we  notice  in  this  case  again  differences 
of  signs  of  some  off-diagonal  elements.  A  consequence  of  these  and  other 
differences  between  corresponding  elements  can  be  seen  by  comparing 
the  confidence  curves  in  Figure  4a  with  those  of  Figure  4b. 

In  order  to  test  the  significance  of  the  correlations  between 
x  and  y  we  have  adjusted  the  same  data  assuming  unit  weights  of  obser¬ 
vations.  The  results  are  given  in  Table  XII  and  Figures  5a  and  5b.  As 
expected,  the  final  parameter  values  are  different  from  those  of  Table  XI. 
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The  purpose  of  the  last  example  was  to  illustrate  the  treatment  of 

correlated  data  and  to  show  the  influence  of  the  correlations  on  the 

results.  Since  the  theory  presented  in  this  paper  is  restricted  to 

constant  variance- covariance  matrices  R.,  we  have  carried  out  all 

J 

calculations  under  that  assumption.  Constant  R^  are  typical  for  situa¬ 
tions  where  the  variances  and  covariances  of  the  data  are  obtained 
by  an  analysis  of  the  measurement  process  and/or  by  linear  coordinate 
trans  f ormati ons . 

In  the  case  described  in  the  last  example,  however,  such  a  treatment 
is  strictly  speaking,  not  correct.  The  R^  were,  namely  obtained  from  the 
original  variance- covariance  matrices  (diagonal  and  constant  in  the 
example)  by  a  non-linear  coordinate  transformation.  One  consequence  of 
such  a  transformation  is,  that  the  elements  of  R^  are  not  constants  but 
depend  on  the  coordinates  x  and  y,  and  are  subject  to  change  whenever 
x  and  y  are  corrected.  A  more  serious  consequence  is  that  the  basic 
equations  of  Sections  2  and  3  must  be  modified  if  non-linear  coordinate 
transformations  are  involved.  We  intend  to  treat  such  transformations 
in  a  forthcomming  paper. 
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FIGURE  2a. 


FIGURE  2b. 


CUBIC  FIT  TO  PEARSON'S  DATA 
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FIGURE  3a. 


FIGURE  3b. 


QUINTIC  FIT  TO  PEARSON'S  DATA 
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FIGURE  4a.  CORRECT  ERROR  ESTIMATES. 
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FIGURE  4b.  CONVENTIONAL  ERROR  ESTIMATES. 
FITTING  OF  PSEUDO -CASSIN IAN  CURVES 


FITTING  OF  PSEUDO-CASSINIAN  CURVES 
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TABLE  I 


PEARSON ' S  DATA  AND  YORK*S  WEIGHTS 


Nr 

Xj(=x) 

x2C-y) 

W1 

W2 

1 

0.0 

5.9 

1000.0 

1.0 

2 

0.9 

5.4 

1000.0 

1.8 

3 

00 

• 

pH 

4.4 

500.0 

4.0 

4 

2.6 

4.6 

800.0 

8.0 

5 

3.3 

3.5 

200.0 

20.0 

6 

4.4 

3.7 

80.0 

20.0 

7 

5.2 

2.8 

60.0 

70.0 

8 

6.1 

2.8 

20.0 

70.0 

9 

6.5 

2.4 

1.8 

100.0 

10 

7.4 

1.5 

1.0 

500.0 
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TABLE  II 


LINEAR  FIT  TO  PEARSON'S  DATA  WITH  UNIT  WEIGHTS 


Cycle 

Nr. 

T1 

T2 

W 

0 

0. 

0. 

154. 

1 

5.761  185  19 

-0.539  577  275 

0.620  119  637 

2 

5.783  849  45 

-0.545  510  327 

0.618  572  871 

3 

5.784  042  12 

-0.545  560  765 

0.618  572  759 

4 

5.784  043  76 

-0.545  561  193 

0.618  572  759 

5 

5.784  043  77 

-0.545  561  197 

0.618  572  759 

Standard 

0.1917 

0.04277 

Error 

Crude 

St.  Error 

0.1899 

0.04223 

W  =  0.618  572  759  437 
Last  Change  of  W  =  4*10 


—2  -35 

it  =  8*10  00 


in  =  0.278  067  6 
o 

Variance- Covariance  Matrix 


( 


3.673*10 


-2 


-6.989*10 


-3 


f  V  /  ^  XV  v  #  ^  v*'  AV 

-6.989*10“3  1. 830*  10~3 
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TABLE  III 


LINEAR  FIT  TO  PEARSON'S  DATA  WITH  YORK'S  WEIGHTS 


Cycle 

Present  Method 

Powell  8  Mac  Donald^ 

Nr. 

T1 

T 

2 

T1 

T2 

0 

1 

2 

3 

4 

5 

0 

5.396  016  28 

5.479  941  92 

5.479  909  72 

5.479  910  23 

5.479  910  22 

0 

-0.463  441  754 

-0.480  540  364 

-0.480  533  321 

-0.480  533  409 

-0.480  533  407 

5.396  1 

5.398  2 

5.477  5 

5.479  9 

5.479  9 

-0.463  45 

-0.463  88 

-0.479  98 

-0.480  53 

-0.480  53 

Standard 

Error 

0.3549 

0.07004 

0.252 

0.0496 

Crude 

St.  Error 

0.3585 

0.07048 

W  =  11.866  353 

m0  =  1.218 

W  =  11.866  353  1941 

Last  change  of  W  3«10^ 
k2  =  4.573*W3 

m0  =  1.215  556 

Variance  -  Covariance  Matrix 

/  1.259-10"1  -2.392*10-2  | 

l-2.392*10~2  4.905?10-3y 
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TABLE  IV 

CUBIC  FIT  TO  PEARSON'S  DATA  WITH  UNIT  WEIGHTS 


Cycle 

Nr. 


T 


3 


0 

1 

2 

3 

4 

5 

6 
7 


0. 

5.998  796  76 
6.015  242  55 
6.015  277  65 
6.015  265  21 
6.015  263  84 
6.015  263  74 
6.015  263  73 


0. 

-1.004  987  63 
-1.000  068  79 
-0.999  855  467 
-0.999  837  200 
-0.999  835  470 
-0.999  835  356 
-0.999  835  347 


0. 

0.157  037  148 
0.152  558  564 
0.152  477  162 
0.152  472  109 
0.152  471  633 
0.152  471  604 
0.152  471  602 


0. 

-1. 

-1. 

-1. 

-1. 

-1. 

-1. 

-1. 


T 


4 


371  381  65*10 
324  759  43 
324  093  64 
324  056  72 
324  053  08 
324  052  87 
324  052  86 


Standard 

Error 


0.3868 


Crude 
St.  Err. 


P  8  M13 
St.  Err. 


0.3663 


0.265 


0.4400 


0.4098 


0.298 


0.1341 


0.1276 


0.0918 


1.153*10 


1. 121* 10~2 


0 . 80  •  10 


-2 


W  =  0.485  152  486  927 
Last  Change  of  W  =  8*10 
K2  =  1.404  *  10"7 


m  =0.284  356  3 
o 


Variance-Covariance  Matrix  Times 


10 


3 


(Upper  Half  Shown  Only) 


140.9  35.59 

193.6  -  56.87 

17.99 


2.637 \ 
4.586  \ 

1.521 
0.1329  / 
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TABLE  V 


CUBIC  FIT  TO  PEARSON'S  DATA  WITH  YORK'S  WEIGHTS 


Cycle 

Nr. 

T1 

T2 

T3 

T4 

0 

-AM 

0. 

1 

0. 

0. 

1 

■RHI 

■■■ 

421 

-0.916 

872 

84 

0.085 

556 

903 

-0.432 

499 

89 • 10-< 

2 

6.258 

917 

93 

-1.256 

138 

96 

0.203 

086 

978 

-1.566 

544 

00 

3 

lmgm 

ALU 

137 

-1.095 

986 

50 

0.154 

063 

801 

-1.133 

672 

56 

4 

6.142 

26 

-1.108 

148 

28 

0.157 

130 

861 

-1.155 

755 

74 

5 

6.142 

213 

53 

-1.108 

214 

45 

0.157 

114 

212 

-1.155 

325 

34 

6 

6.142 

353 

27 

-1.108 

387 

74 

0.157 

166 

014 

-1.155 

767 

22 

7 

6.142 

316 

13 

-1.108 

335 

36 

0.157 

148 

606 

-1.155 

604 

69 

8 

6.142 

334 

88 

-1.108 

360 

68 

0.157 

156 

748 

-1.155 

678 

74 

9 

6.142 

326 

99 

-1.108 

349 

92 

0.157 

153 

261 

-1.155 

646 

83 

10 

6.142 

329 

74 

-1.108 

353 

56 

0.157 

154 

414 

-1.155 

657 

20 

11 

6.142 

329 

38 

-1.108 

353 

17 

0.157 

154 

310 

-1.155 

656 

39 

12 

6.142 

329 

42 

-1.108 

353 

23 

0.157 

154 

333 

-1.155 

656 

62 

13 

6.142 

329 

40 

-1.108 

353 

20 

0.157 

154 

320 

-1.155 

656 

51 

Standard 

Error 

1.028 

0.7692 

0.1794 

1.324 

•10"' 

2 

Crude 

Error 

1.034 

0.8214 

0.2102 

1.702 

•10“ 

2 

W  =  10.486  904  057  7 


Last  Change  of  W  =  10 
k2  =  2.352*10~3 
mQ  =  1.320  567 

3 

Variance- Covariance  Matrix  Times  10 
(Upper  Half  Shown  Only) 

(1058.  -  730.8  149.6  -  9.334  \ 

591.7  -  133.4  8.984  \ 

32.19  -  2.305 

0.1753  ) 
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TABLE  VI 


QUINTIC  FIT  TO  PEARSON'S  DATA  WITH  UNIT  WEIGHTS 


1 

Present  Method 

Powell  S  MacDonald^ 

1 

Parameters  t^ 

Last  AT. 

i 

Standard 

Errors 

Parameters  t^ 

Standard 

Errors 

1 

5.914  825  96 

-6*10'9 

0.411  9 

5.915 

0.288 

2 

-0.603  166  896 

5*10~8 

1.748  0 

-0.6034 

1.20 

3 

-8.032  030  78*10-2 

-5*10-8 

168.9*10-2 

-8.003 *10"2 

117*10-2 

4 

2.632  202  02*10~2 

2*10-8 

60. 13* 10-2 

2.622*10"2 

41 .7*10-2 

5 

-8.277  185  40*10“4 

-3*10'9 

896. 8*10”4 

-8. 119*10-4 

623*10"4 

6 

-1.675  050  59*10-4 

i*io-10 

47.46*10"4 

-1.683*10‘4 

33*10"4 

Number  of  Cycles:  10 

(Starting  with  T^=  0) 

Number  odf  Cycles:  10  1 

(Starting  with  ApproximJ 

W 

=  0.450  325  667  217 

■i  ^ 

W  =  0.450  329 

283  54 

Last  Change  of  W  =  3*10~ 

(Computed  using  (A-l), 

k 

2  =  1. 136*10~8 

(A- 2) ,  and 

(A- 4)) 

m 

=  0.335  531  50 

0 

m  =  0  033  55 
o 

Variance  -  Covariance  Matrix  Times  10 
(Upper  Half  Shown  Only) 

169.7  -  428.5  316*1  -  96.51 


12.95  -  0.6330 


/  3056. 

-  2869. 

977.3  - 

139.6 

7.100 

f 

2854. 

1005. 

146.8 

-  7.583 

1 

/ 

/ 

l 

361.6  - 

53.62 

2.803 

8.043 


0.4241 

2.253*10' 
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TABLE  VII 


QUINTIC  FIT  TO  PEARSON'S  DATA  WITH  YORK'S  WEIGHTS 


i 

Parameters  t. 

l 

Last  AT. 

'  l 

Standard 

Errors 

Crude 

St.  Errors 

i 

6.029  451  86 

2*10-8 

1.508 

1.503 

2 

-1.530  034  23 

-2  -10~7 

3.539 

3.419 

3 

0.817  877  33 

2*10“? 

2.805 

2.647 

4 

-0.294  920  02 

-7*10~8 

0.9164 

0.8548 

5 

4.698  541  20  * 10-2 

1-10"8 

13.16-10-2 

12.30*10-2 

6 

-2.666  420  13* 10"3 

-6* 10-10 

6.876* 10~3 

6.528* 10“2 

NUMBER  OF  CYCLES:  13  (Starting  with  T±  =  0) 
W  =  9 .505  013  741  86 
Last  Change  of  W  =  9*10-1^ 
k2  =  1.931* 10"3 


mQ  =  1.539  944 


3 

Variance-Covariance  Matrix  Times  10 
(Upper  Half  Shown  Only) 


/2274.  -  3861. 

2268. 

602.3 

74.09  - 

3.430  \ 

1  12520. 

-  9536. 

2934. 

397.5 

19.71  \ 

7869. 

-  2535. 

354.2  - 

17.96  j 

\ 

839 . 8  - 

119.7 

6.159  1 

17.31  - 


0.9006 

4.728*10' 
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TABLE  VIII 

INPUT  FOR  ACOUSTIC  AMPLIFICATION  PROBLEM 


a 

e 

a 

f[Hz] 

ef 

P[pa] 

e 

P 

14 

2.1 

500 

20 

1.482*106 

0.08274* 106 

29 

3.5 

1000 

30 

1.482* 106 

0.08274* 106 

57 

3.4 

2000 

50 

1.482* 106 

0.08274*  106 

82 

5.3 

3000 

60 

1.482* 106 

0.08274* 106 

103 

7.2 

4000 

80 

1.482* 106 

0.08274. 106 

122 

9.1 

5000 

100 

1.482* 106 

0.082 74* 106 

133 

10.6 

6000 

120 

1.482* 106 

0.08274* 106 

135 

11.4 

7000 

140 

1.482* 106 

0.08274. 106 

131 

11.8 

8000 

160 

1.482* 106 

0.08274* 106 

125 

11.9 

9000 

180 

1.482* 106 

0.08274* 106 

117 

11.7 

10000 

200 

1.482* 106 

0.08274* 106 

8.5 

1.3 

500 

20 

2. 861* 106 

0. 1241* 106 

18 

2.1 

1000 

30 

2.861* 106 

0. 1241* 106 

37 

2.2 

2000 

50 

2.861*  106 

0.1241* 106 

55 

3.6 

3000 

60 

2.861* 106 

0. 1241* 106 

73 

5.1 

4000 

80 

2.861* 106 

0.1241*106 

88 

6.6 

5000 

100 

2.861* 106 

0.1241* 10^ 

99 

7.9 

6000 

120 

2.861* 106 

0.1241* 10^ 

104 

8.8 

7000 

140 

2.861* 106 

0.1241-106 

105 

9.5 

8000 

160 

2. 861* 106 

0.1241-106 
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TABLE  VIII 
(con't) 


a 

e 

a 

f  [Hz] 

®f 

p[Pa] 

eP 

103 

9.8 

9000 

180 

2.861*106 

0 . 1241  * 106 

101 

10.1 

10000 

200 

2.861* 10^ 

0.1241*106 

13 

1.0 

2000 

50 

5.619  *  10  ^ 

0.2206* 10^ 

24 

1.6 

3000 

60 

5.619 • 106 

0.2206* 10^ 

28 

2.0 

4000 

80 

5.619* 106 

0.2206* 106 
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TABLE  IX 


PARAMETERS  OF  ACOUSTIC  AMPLIFICATION  PROBLEM 


PAR 

A  M  E  T  E  R  S 

Last 

Changes 

Standard 

Errors 

Crude 

Standard 

Errors 

Initial 

Values 

Final  Values 

«n 

40. 

39.854  580  8 

9*10-8 

3. 165 

3.703 

H 

725. 

724.680  299 

2* 10-6 

63.13 

76.64 

1,93* i05 

1*903  796  24* 105 

6* 10-5 

0 . 1200* 105 

0.1206-105 

H 

0.63 

0.634  854  890 

_9 

1.10 

0.03392 

0.04131 

Number  of  Cycles:  8 
W=16.496  783  247  4 
Last  Change  of  W  =  2*10 
k2  =  0.00768 


mQ  =  0.881  111  5 

Variance- Covariance  Matrix 
(Upper  Half  Only) 


10.02 


133.0 

3986. 


1.249-10* 
3.340* 10^ 
1.440* 10° 


0.09830 

1.553 

51.08 

1.151*10 


-3 


Crude  Variance-Covariance  Matrix 


(Upper  Half  Only) 


(13.71  219.1  - 

5873. 


0. 8234* 104 

4.117*105 

1.455-108 


0.1433 

2.606 

35.22 

1.706*10 
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TABLE  X 

DATA  FOR  FITTING  GENERALIZED  CASSINIAN  CURVE 


X 

Y 

X 

Y 

1.0 

5.0 

5.0 

9.0 

2.0 

3.0 

3.0 

8.0 

4.0 

1.0 

1.0 

7.5 

6.0 

1.0 

-1.0 

9.0 

7.0 

3.0 

-3.0 

9.5 

8.0 

6.0 

-4.5 

8.0 

7.5 

8.0 

-4.0 

6.0 

6.5 

9.0 

-2.0 

5.0 

Standard  Errors  of  Observations 
er  =  0.02  (x2  +  y2) 

ey,  =  0.08 
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TABLE  XI 


RESULTS  OF  FITTING  PSEUDO  -  CASS IN I AN  CURVE  TO  CORRELATED  OBSERVATIONS 


PARAMETERS 

Crude 

Initial 

Values 

Final  Values 

Last 

Changes 

Standard 

Errors 

Standard 

Errors 

*1 

-2. 

-3.246  408  5 

-4* 10"8 

0.4386 

0.4472 

yl 

7. 

7.606  215  9 

8-10'9 

-9 

0.1616 

0.3261 

X2 

5. 

5.097  509  9 

2*10 

0.1929 

0.2307 

y2 

4.5 

3.855  190  1 

-9 

7*10 

0.2832 

0.3083 

a 

200. 

437.692  47 

6- 10~6 

48.76 

99.06 

b 

0.25 

0.376  844  61 

-9 

1*10 

0.1324 

0.09642 

Number  of  Cycles:  21 

W  =  3.469  719  340  38 

Last  Change  of  W  *  7*10 

V?  =  1.84-10"3 

in  -  0.586  531  8 
o 


Variance-Covariance  Matrix  (Upper  Half  Only) 


/ 0.1924  -  4.213-10 

[  2.612-10 


-5.487-10"3 

-5.812-10"2 

2.153*10", 

3.820*10":: 

3.720*10 

-3.611-10", 

8.024-10" 

-0.1379 

0.6076 

8.487 

-0.9958 

2.377-10 


-4.736*10 

1.420-10 

6.952*10 

1.613-10 

2.130 

1.753-10 


Crude  Varinace- Covariance  Matrix  (Upper  Half  Only) 


^0.2000 


4.478* 10"2  -6.525*10" 
10.63- 10-2  17.12- 10"3 

5.324-10" 


-5.720-10  , 
-2.015-10" 
-15.57- 10”3 
9.503* 10-2 


-30.94 
20.29 
14.88 
0.6343  _ 

9.814* lO* 


-0.389*10  ‘ 
1.700* 10“2 
10.89- 10-3 
-1.535- 10-2 


5.536 

0.929*10 


-2 
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TABLE  XII 


RESULTS  OF  FITTING  PSEUDO- CASS IN IAN  CURVE 
TO  OBSERVATIONS  WITH  UNIT  WEIGHTS 


P  A 

RAMETERS 

Crude 

Initial 

Values 

Final  Values 

Last 

Changes 

Standard 

Errors 

Standard 

Errors 

X1 

-2. 

-2.887  709  0 

-4* 10~9 

0.8572 

0.3152 

*1 

7. 

6.983  339  1 

1*10“8 

0.1360 

0.2468 

X2 

5. 

5.765  751  0 

-9 

8.10 

0.2297 

0.3351 

y2 

4.5 

4.505  450  5 

_9 

-5*10 

0.4386 

0.3637 

a 

200. 

414.933  17 

-1*10“6 

45.89 

66.01 

b 

0.25 

0.252  214  55 

_9 

-2*10 

0.1792 

0.0580 

Number  of  Cycles:  10 
W  =  2.674  613  584  39 
Last  Change  of  W  =  4*10 
k2  =  5.75-10"14 


m  =  0.516  275  9 
o 


Variance-Covariance  Matrix  (Upper  Half  Only) 


/ 0.7348  -  3.641*10“2  - 

7.694* 10“2 

8. 775* 10“? 

/  1.850*10“  - 

2 .735*  10 “if 

4.034* 10~2 

5.272*10“ 

-  3.376*10“^ 

1.924*10“ 

\ 


-l\ 

1.195  . 

1.462*10  i\ 

8.008*  10_1 

5.392 *10-2  ' 

7.652 

1.963* 10_2 

3.930 

-  2.075*10“ 

2.106*10 

1.250  J 

3.210  *10  “/ 

Crude  Variance-Covariance  Matrix  (Upper  Half  Only) 

/ 0.0993  -  0.261* 10“2  1.384*10“2  -  1.671*10 

/  6.090*10  -  0.428*10 -  1.961*10 

/  5.528*10“  -  0.468*10 

1.323*10 


\ 


13.82 

20.22*10“ 

7.525 

0.0012 

4.353-10 


0.046°  103 
1 . 869 • lO^ 
0.453* id: 
0.553*10 
2.281  2J 
0.336*10  / 
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APPENDIX  A 


Computer  Program  COLSGN 

The  numerical  solution  of  the  least  squares  equations  (22)  and  (23) 
of  Section  2  can  be  implemented  on  a  computer  in  different  ways.  In  this 
Appendix  we  shall  describe  the  implementation  of  the  solution  by  the 
BRL  Applied  Mathematics  Laboratory's  code  COLSGN  (correlated  Least 
Squares  with  General  constraints),  and  give  the  rationale  for  the 
particular  implementation  chosen. 


The  computation  in  COLSGN  are  initiated  by  evaluating  the  following 
equations : 


S-j  T  * 

3  A.  R.  A. 

3  3  3 


(j  —  1,2, r)  , 


V  W)-  V  .  U  =  l>2,...,r), 


(A- 1) 

(A- 2) 


r  T  r 

x  g.B.BT  r  =  X  k.B..  (A- 3) 

j  =  1  3  3  3  j  =  1  3  3 

These  equations  correspond  to  Equations  (20),  (21)  and  (22),  respectively, 

whereby  the  Equations  (21)  for  the  correlates  are  slightly  modified. 

Generally  the  C.  n  0  at  the  beginning,  but  it  was  found  adventageous 

for  some  problems  to  provide  an  option  to  start  the  calculations  with 

estimated  non- zero  values  of  the  C.. 

3 

The  Equations  (A-3)  are  solved  for  t,  the  parameter  vector  T  replaced 
by  T  +  t  in  all  arguments  and  a  new  t  determined  using  Equations  (A-l) 
(A-2)  and  (A-3).  In  other  words.  Equations  (A-J) ,  (A-2)  and  (A-3)  are 
used  for  iteration  of  the  parameters,  keeping  the  residuals  fixed. 

After  that  iteration  has  come  to  a  halt  (iteration  end  conditions 
will  be  discussed  later),  the  residuals  are  computed  by  a  modified 
Equation  (23),  namely. 
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(A- 4) 


Cj  *  £j  =kj  Rj  Aj  «  =  1’2’"--r) 

The  Cj  are  then  replaced  by  in  all  arguments  and  new 

computed  using  the  Equations  (A-l),  (A-2)  and  (A-4).  Thus,  during 
this  phase  of  computations,  the  residuals  are  iterated,  keeping  thereby 
the  parameters  fixed.  The  iteration  is  continued  until  an  end  criterion 
is  satisfied.  This  completes  then  one  iteration  cycle.  The  next  cycle 
is  started  by  a  new  computation  of  the  parameters  using  the  new 
values  of  and  Equations  (A-l),  (A-2)  and  (A-3). 

Let  the  three  iterations  which  are  involved  in  the  computing 
process  be  called  "parameter  iteration",  "residual  iteration"  and  "cycle 
iteration",  respectively.  The  end  conditions  for  these  iterations  are  purely 
numerical  in  the  COLSGN  code  and,  therefore,  arbitrary  to  a  certain  degree. 
They  are  established  mainly  by  numerical  experiments. 


One  obvious  criterion  for  the  cycle  convergence  is  that  the  weighted  sum 


of  the  residual  squares,  namely, 

r 

W  =  2 
j  =  l 


T  -1 

c!  R.  c. 

1  3  3 


(A- 5) 


becomes  stationary.  Such  a  condition  alone,  however,  is  not  suitable 
for  numerical  purposes,  because  in  the  vicinity  of  a  minimum  W  is 
little  influenced  by  small  changes  of  the  variables  (C^  and  T) .  As  a 
second  criterion,  therefore,  a  requirement  was  introduced  that  the  changes 
of  the  components  of  the  parameter  vector  T  between  cycles  become 

-7 

smaller  than  a  threshold.  TTiat  threshold  was  set  arbitrarily  at  10 
of  a  crude  estimate  of  the  corresponding  standard  error.  Particularly, 
COLSGN  stops  the  cycle  iteration,  if  all  6f  the  following  conditions 
are  satisfied: 

|aw|  <  io"12 
»-7 


m  W 
o 


I ATj |  <  10"'  *  mQ  /Q  ,  j  =  l,...,p 


(A-6) 
(A- 7) 


where  AW  and  AT.  are  the  changes  of  W  and  T.,  respectively,  by  the  last 
2  3  3 

cycle,  mQ  is  the  variance  of  weight  one  (see  Equation  (56) ,  Section  3) 

and  Qjj  is  a  diagonal  element  of  the  matrix  inverse  to  the  normal  equation 
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matrix  in  Equation  (A-3) .  As  pointed  out  in  Sections  3  and  4,  m0/Q77 

are  poor  estimates  for  the  standard  errors  of  parameters.  Nevertheless 

they  are  used  by  COLSGN  in  its  end  conditions  because  the  Q..  are 

33 

readily  available  during  the  iterations  and  because  an  error  even  by 
an  order  of  magnitude  would  not  have  serious  influence  on  the  results. 

The  conditions  (A-7)  are  safer  to  apply  than  for  instance,  the 

requirements  that  the  relative  values  |AT\/T\|  be  less  than  a  threshold, 

because  some  of  the  parameters  might  approach  zero  at  convergence. 

13 

(Such  convergence  criteria  are  used  by  Powell  §  Macdonald  ).  In  all 
test  examples  computed, the  conditions  (A-7)  were  more  stringent  that 
(A-6) ,  the  latter  being  satisfied  several  cycles  before  (A-7). 

The  iteration  end  criterion  for  the  parameter  iteration  should 
merely  insure  that  the  nonlinear  effedts  of  the  constraint  functional 
are  reduced  by  the  iteration.  This  is  the  case,  if  the  correction  t 
of  T  is  "sufficiently"  small.  During  the  first  cycles,  on  the  other  hand, 
the  end  condition  for  the  parameter  iteration  should  not  be  too 
stringent,  because  the  parameter  values  computed  during  theses  cycles 
are  only  preliminary  approximations.  They  will  be  corrected  anyway  in 
each  subsequent  cycle  after  the  computation  of  new  residuals.  Finally, 
the  "smallness"  of  t  should  be  in  some  relation  to  the  present  accuracy 
of  W. 


With  these  considerations  in  mind,  the  end  condition  for  the  parameter 
iteration  was  formulated  as  follows : 

r 


with 


tT  .  2  k.B.  <  10-a.  in  .  W 

j  j  o  parameters 


r  kt 

W  =2  — 

parameters  ^  g. 

3  3 


(A- 8) 

(A-9) 


and 


a  <  min  (12,  Cycle  Nr  +2). 


(A- 10) 
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The  factor  10  in  (A- 8)  takes  care  of  an  increase  in  the  accuracy 
requirement  as  the  cycle  number  increases.  ^parameter  i®  according  to 
Section  3  an  approximation  to  W.  (W  cannot  be  computed  by  Equation 
(A-5)  during  the  parameter  iteration,  because  the  do  not  change 
during  that  iteration.)  The  left  hand  side  of  Equation  (A-8)  is 
roughly  proportional  to  the  change  of  Wparameter  due  to  the  change  x 
of  the  parameter  T.  Hence,  (A-8)  requires  that  an  estimate  of  the 
relative  change  of  W  due  to  x  should  be  less  than  a  threshold. 


The  above  mentioned  relation  between  x  and  W  .  can  be 

parameter 

derived  as  follows.  Neglecting  the  changes  of  A^  (and,  therefore,  those 
of  g.)  due  to  x  we  obtain  from  (A-2)  for  the  changes  Ak .  of  k . , 


'J 


J 


3 


A  k .  *  -g  .  x  B . . 
3  6 3  3 


(A- 11) 


Substituting  (A-ll)  into  (A-9)  we  obtain, 

2  v2 

T 


(k.  +  Ak.)' 

_J _ L 


g 


3 


"  2  /  -  2* 


2  k.B. 
3  3 


T  2 


where  the  first  term  on  the  right  hand  side  is  twice  the  left  hand  side 

of  the  condition  (A-8),  with  negative  sign.  Incidentilly  the  expression 

T  T 

x  £  kjBj  is,  because  of  Equation  (A- 3) ,  equal  to  x  Nx,  where  N  is  the 

positive  definite  normal  equation  matrix.  Hence  that  expression  can  be 

considered  as  the  square  of  a  norm  of  t.  Therefore,  the  condition  (A-8) 

can  also  be  interpreted  as  a  condition  for  a  norm  of  x. 

As  the  cycle  number  increases  the  condition  (A-8)  becomes  more 
stringent.  On  the  other  hand,  the  corrections  become  smaller  and, 
therefore,  the  linearized  constraint  functional  an  ever  better  approxi¬ 
mation.  Usually  after  a  few  initial  cycles  the  condition  (A-8)  is  satisfied 
by  the  first  parameter  iteration  step. 

The  initial  guess  T  of  the  parameters  can  be  quite  different  from 
the  least  squares  solution  t.  Therefore,  COLSGN  carries  out  in  the 
first  cycle  at  least  three  parameter  iterations,  regardless  whether 
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(A-8)  is  satisfied  or  not.  Experiments  show  that  more  than  three 
parameter  iterations  are  necessary  in  any  cycle  only  if  the  current 
approximation  T  is  very  bad.  In  those  cases  the  system  (A-l),  (A-2)  and 
(A-3)  may  even  fail  to  converge  at  all.  Convergence  can  then  often 
be  achieved,  if  the  weights  gj  are  kept  constant  during  the  iterations. 
Therefore,  the  weights  of  the  third  parameter  iteration  are  used 
for  any  futher  iteration  within  each  cycle.  (Only  (A-2)  and  (A-3) 
are  used  for  the  4th,  5th  etc.  parameter  iterations  in  each  cycle.) 

For  the  residual  iteration  similar  considerations  hold  as  for  the 
parameter  iteration.  Because  for  this  iteration,  W  can  be  computed 
by  Equation  (A-5) ,  also  its  change  due  to  the  can  be  computed 
directly.  The  corresponding  formula  is, 

r  T  - 1 

AW  =  £  e.  R.  (2C.  +  e.).  (A-13) 

e  j=1  3  3  3  3 

The  iteration  end  condition  for  the  residual  iteration  is  accordingly 

|AWj  <  10"a  mQ  •  W  ,  (A- 14) 

where  a  is  defined  by  (A- 10).  Usually  after  a  few  initial  cycles  this 

condition,  too,  is  satisfied  by  the  first  iteration  step. 

Estimates  of  the  variances  and  covariances  of  the  parameters  are 
computed  after  completed  iteration  by  a  straight  forward  evaluation  of 
the  formulas  of  Section  3. 
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